-
Notifications
You must be signed in to change notification settings - Fork 11
Add check for duration of time columns in DynamicTable #628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
…outBorders/nwbinspector into dyn-table-duration
…outBorders/nwbinspector into dyn-table-duration
|
@stephprince this PR has been sitting for over a month. Can you please review it? |
|
Yes I can take a look this afternoon! Sorry about that - thanks for the ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bendichter added my review - can you also add a best practice documentation section?
| # Check for timestamp column (possibly with duration) | ||
| if "timestamp" in table.colnames and len(table["timestamp"]) > 0: | ||
| timestamp_data = table["timestamp"] | ||
| start_times.append(float(timestamp_data[0])) | ||
|
|
||
| if "duration" in table.colnames and len(table["duration"]) > 0: | ||
| duration_data = table["duration"] | ||
| end_times.append(float(timestamp_data[-1] + duration_data[-1])) | ||
| else: | ||
| end_times.append(float(timestamp_data[-1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we recommend storing data this way somewhere? This seems relatively arbitrary to look for (vs. the start_time/end_time and spike_times columns we know exist in DynamicTables from the schema).
| # Check for timestamp column (possibly with duration) | |
| if "timestamp" in table.colnames and len(table["timestamp"]) > 0: | |
| timestamp_data = table["timestamp"] | |
| start_times.append(float(timestamp_data[0])) | |
| if "duration" in table.colnames and len(table["duration"]) > 0: | |
| duration_data = table["duration"] | |
| end_times.append(float(timestamp_data[-1] + duration_data[-1])) | |
| else: | |
| end_times.append(float(timestamp_data[-1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some other inspector checks that search for column names ending with _time. I think those could be checked for duration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for ndx-events. I should have mentioned that somewhere, huh?
| start_times.append(float(table["start_time"][0])) | ||
| if "stop_time" in table.colnames and len(table["stop_time"]) > 0: | ||
| end_times.append(float(table["stop_time"][-1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also check max/min of all start and stop times if we don't want to assume that these columns are also following ascending order best practices
| def test_check_table_time_columns_duration_with_timestamp(): | ||
| """Test with timestamp column.""" | ||
| table = DynamicTable(name="events", description="test events") | ||
| table.add_column(name="timestamp", description="event timestamps") | ||
| table.add_row(timestamp=0.0) | ||
| table.add_row(timestamp=100.0) | ||
|
|
||
| assert check_table_time_columns_duration(table) is None | ||
|
|
||
|
|
||
| def test_check_table_time_columns_duration_with_timestamp_and_duration(): | ||
| """Test with timestamp and duration columns.""" | ||
| one_year = 31557600.0 | ||
| table = DynamicTable(name="events", description="test events") | ||
| table.add_column(name="timestamp", description="event timestamps") | ||
| table.add_column(name="duration", description="event durations") | ||
| table.add_row(timestamp=0.0, duration=10.0) | ||
| table.add_row(timestamp=one_year, duration=1000.0) | ||
|
|
||
| result = check_table_time_columns_duration(table) | ||
| assert result is not None | ||
| assert "exceeds the threshold" in result.message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment in _tables.py about these column names and whether we should keep these tests
| assert check_table_time_columns_are_not_negative(test_table) is None | ||
|
|
||
|
|
||
| def test_check_table_time_columns_duration_pass_short(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests for the Units table / spike_times case?
Co-authored-by: Steph Prince <[email protected]>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #628 +/- ##
==========================================
+ Coverage 73.03% 76.51% +3.47%
==========================================
Files 47 47
Lines 1587 1622 +35
==========================================
+ Hits 1159 1241 +82
+ Misses 428 381 -47
🚀 New features to boost your workflow:
|
#627